AITopics | fs method

Collaborating Authors

fs method

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Efficient High-Order Interaction-Aware Feature Selection Based on Conditional Mutual Information

Alexander Shishkin, Anastasia Bezzubtseva, Alexey Drutsa, Ilia Shishkov, Ekaterina Gladkikh, Gleb Gusev, Pavel Serdyukov

Neural Information Processing SystemsApr-22-2026, 08:41:00 GMT

This study introduces a novel feature selection approach CMICOT, which is a further evolution of filter methods with sequential forward selection (SFS) whose scoring functions are based on conditional mutual information (MI). We state and study a novel saddle point (max-min) optimization problem to build a scoring function that is able to identify joint interactions between several features. This method fills the gap of MI-based SFS techniques with high-order dependencies. In this high-dimensional case, the estimation of MI has prohibitively high sample complexity. We mitigate this cost using a greedy approximation and binary representatives what makes our technique able to be effectively used. The superiority of our approach is demonstrated by comparison with recently proposed interactionaware filters and several interaction-agnostic state-of-the-art ones on ten publicly available benchmark datasets.

artificial intelligence, interaction, machine learning, (17 more...)

Neural Information Processing Systems

Country: Europe (0.46)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

4d689f0f30199661a10aa2200488aebb-Paper-Conference.pdf

Neural Information Processing SystemsFeb-11-2026, 14:16:08 GMT

artificial intelligence, dataset, machine learning, (15 more...)

Neural Information Processing Systems

Country:

Asia > China > Shaanxi Province > Xi'an (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

ROOFS: RObust biOmarker Feature Selection

Bakhmach, Anastasiia, Dufossé, Paul, Vaglio, Andrea, Monville, Florence, Greillier, Laurent, Barlési, Fabrice, Benzekry, Sébastien

arXiv.org Machine LearningJan-9-2026

Feature selection (FS) is essential for biomarker discovery and in the analysis of biomedical datasets. However, challenges such as high-dimensional feature space, low sample size, multicollinearity, and missing values make FS non-trivial. Moreover, FS performances vary across datasets and predictive tasks. We propose roofs, a Python package available at https://gitlab.inria.fr/compo/roofs, designed to help researchers in the choice of FS method adapted to their problem. Roofs benchmarks multiple FS methods on the user's data and generates reports that summarize a comprehensive set of evaluation metrics, including downstream predictive performance estimated using optimism correction, stability, reliability of individual features, and true positive and false positive rates assessed on semi-synthetic data with a simulated outcome. We demonstrate the utility of roofs on data from the PIONeeR clinical trial, aimed at identifying predictors of resistance to anti-PD-(L)1 immunotherapy in lung cancer. The PIONeeR dataset contained 374 multi-source blood and tumor biomarkers from 435 patients. A reduced subset of 214 features was obtained through iterative variance inflation factor pre-filtering. Of the 34 FS methods gathered in roofs, we evaluated 23 in combination with 11 classifiers (253 models in total) and identified a filter based on the union of Benjamini-Hochberg false discovery rate-adjusted p-values from t-test and logistic regression as the optimal approach, outperforming other methods including the widely used LASSO. We conclude that comprehensive benchmarking with roofs has the potential to improve the robustness and reproducibility of FS discoveries and increase the translational value of clinical models.

artificial intelligence, dataset, machine learning, (19 more...)

arXiv.org Machine Learning

2601.05151

Country:

Europe > France (0.29)
Europe > United Kingdom > England (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology > Lung Cancer (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

Semi-Supervised Federated Multi-Label Feature Selection with Fuzzy Information Measures

Mahanipour, Afsaneh, Khamfroush, Hana

arXiv.org Machine LearningNov-25-2025

Multi-label feature selection (FS) reduces the dimensionality of multi-label data by removing irrelevant, noisy, and redundant features, thereby boosting the performance of multi-label learning models. However, existing methods typically require centralized data, which makes them unsuitable for distributed and federated environments where each device/client holds its own local dataset. Additionally, federated methods often assume that clients have labeled data, which is unrealistic in cases where clients lack the expertise or resources to label task-specific data. To address these challenges, we propose a Semi-Supervised Federated Multi-Label Feature Selection method, called SSFMLFS, where clients hold only unlabeled data, while the server has limited labeled data. SSFMLFS adapts fuzzy information theory to a federated setting, where clients compute fuzzy similarity matrices and transmit them to the server, which then calculates feature redundancy and feature-label relevancy degrees. A feature graph is constructed by modeling features as vertices, assigning relevancy and redundancy degrees as vertex weights and edge weights, respectively. PageRank is then applied to rank the features by importance. Extensive experiments on five real-world datasets from various domains, including biology, images, music, and text, demonstrate that SSFMLFS outperforms other federated and centralized supervised and semi-supervised approaches in terms of three different evaluation metrics in non-IID data distribution setting.

dataset, feature selection, selection, (12 more...)

arXiv.org Machine Learning

2511.17796

Country: North America > United States > Kentucky (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.48)

Add feedback

P-CAFE: Personalized Cost-Aware Incremental Feature Selection For Electronic Health Records

Kashani, Naama, Cohen, Mira, Shaham, Uri

arXiv.org Artificial IntelligenceOct-27-2025

Electronic Health Records (EHRs) serve as comprehensive digital repositories of patient health information, encompassing both structured and unstructured data (Bates et al., 2014). A thorough understanding of EHR data can significantly enhance various aspects of patient care, including disease prediction, healthcare quality improvement, and resource allocation (Shickel et al., 2018; Kim et al., 2019). However, EHR data presents unique challenges: it is often high-dimensional, multimodal, sparse, and temporal (Wu et al., 2010; Menachemi and Collum, 2011; Xiao et al., 2018). Records typically include a diverse array of modalities, such as demographics, diagnoses, procedures, medications, prescriptions, radiological images, clinical notes, and laboratory results. The data is inherently sparse, as medical events occur irregularly, and sequential, as patient histories accumulate over time. To address these complexities, many approaches employ feature selection (FS) -- the process of identifying the most informative variables from high-dimensional input to improve model performance, interpretability, and robustness (Remeseiro and Bolon-Canedo, 2019; Chandrashekar and Sahin, 2014). Y et, to the best of our knowledge, existing FS methods applied to EHRs either ignore multimodality or fail to capture temporal dynamics.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.3233/FAIA251211

2508.08646

Country: North America > United States (0.28)

Genre:

Research Report (1.00)
Workflow (0.93)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Add feedback

4d689f0f30199661a10aa2200488aebb-Paper-Conference.pdf

Neural Information Processing SystemsOct-8-2025, 15:52:04 GMT

artificial intelligence, dataset, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > China > Shaanxi Province > Xi'an (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Semantic-Inductive Attribute Selection for Zero-Shot Learning

Herrera-Aranda, Juan Jose, Gomez-Trenado, Guillermo, Herrera, Francisco, Triguero, Isaac

arXiv.org Artificial IntelligenceOct-7-2025

Zero-Shot Learning is an important paradigm within General-Purpose Artificial Intelligence Systems, particularly in those that operate in open-world scenarios where systems must adapt to new tasks dynamically. Semantic spaces play a pivotal role as they bridge seen and unseen classes, but whether human-annotated or generated by a machine learning model, they often contain noisy, redundant, or irrelevant attributes that hinder performance. To address this, we introduce a partitioning scheme that simulates unseen conditions in an inductive setting (which is the most challenging), allowing attribute relevance to be assessed without access to semantic information from unseen classes. Within this framework, we study two complementary feature-selection strategies and assess their generalisation. The first adapts embedded feature selection to the particular demands of ZSL, turning model-driven rankings into meaningful semantic pruning; the second leverages evolutionary computation to directly explore the space of attribute subsets more broadly. Experiments on five benchmark datasets (A WA2, CUB, SUN, aPY, FLO) show that both methods consistently improve accuracy on unseen classes by reducing redundancy, but in complementary ways: RFS is efficient and competitive though dependent on critical hyperparameters, whereas GA is more costly yet explores the search space more broadly and avoids such dependence. These results confirm that semantic spaces are inherently redundant and highlight the proposed partitioning scheme as an effective tool to refine them under inductive conditions.

evolutionary algorithm, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2510.0326

Country: Europe (0.28)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.87)

Add feedback

IEFS-GMB: Gradient Memory Bank-Guided Feature Selection Based on Information Entropy for EEG Classification of Neurological Disorders

Zhang, Liang, Dong, Hanyang, Gao, Jia-Hong, Sun, Yi, Xiao, Kuntao, Yang, Wanli, Lv, Zhao, Sheng, Shurong

arXiv.org Artificial IntelligenceSep-22-2025

These authors contribute equally to this work. Abstract Deep learning-based EEG classification plays a pivotal role in the automated detection of neurological disorders, offering significant advantages in diagnostic accuracy and early intervention for personalized clinical treatment. However, the performance of such classification approaches is fundamentally limited by the intrinsic low signal-to-noise ratio characteristic of EEG signals. Consequently, feature selection (FS) is essential in optimizing the EEG representations derived from neural network encoders, thereby enhancing the overall efficacy of EEG classification frameworks. Currently, few FS methods have been tailored for EEG neurological diagnosis, and most FS methods from other fields are designed for specific network architectures and lack clarity in interpretation, which restricts their direct utility in EEG classification. These authors contribute equally to this work. Consequently, these approaches may lack the robustness necessary to effectively handle data variability. To address these challenges, we introduce IEFS-GMB, a novel I nformation Entropy-based F eature Selection approach guided by a Gradient Memory Bank. This method begins by establishing a dynamic gradient memory bank that archives the sampled gradients from previous training iterations.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2509.15259

Country: Asia > China (0.30)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Comparative Study of Feature Selection in Tsetlin Machines

Halenka, Vojtech, Granmo, Ole-Christoffer, Jiao, Lei, Andersen, Per-Arne

arXiv.org Artificial IntelligenceAug-12-2025

Feature Selection (FS) is crucial for improving model interpretability, reducing complexity, and sometimes for enhancing accuracy. The recently introduced Tsetlin machine (TM) offers interpretable clause-based learning, but lacks established tools for estimating feature importance. In this paper, we adapt and evaluate a range of FS techniques for TMs, including classical filter and embedded methods as well as post-hoc explanation methods originally developed for neural networks (e.g., SHAP and LIME) and a novel family of embedded scorers derived from TM clause weights and Tsetlin automaton (TA) states. We benchmark all methods across 12 datasets, using evaluation protocols, like Remove and Retrain (ROAR) strategy and Remove and Debias (ROAD), to assess causal impact. Our results show that TM-internal scorers not only perform competitively but also exploit the interpretability of clauses to reveal interacting feature patterns. Simpler TM-specific scorers achieve similar accuracy retention at a fraction of the computational cost. This study establishes the first comprehensive baseline for FS in TM and paves the way for developing specialized TM-specific interpretability techniques.

artificial intelligence, dataset, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2508.06991

Genre: Research Report > New Finding (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)

Add feedback

Improving Audio Classification by Transitioning from Zero- to Few-Shot

Taylor, James, Mack, Wolfgang

arXiv.org Artificial IntelligenceJul-29-2025

State-of-the-art audio classification often employs a zero-shot approach, which involves comparing audio embeddings with embeddings from text describing the respective audio class. These embeddings are usually generated by neural networks trained through contrastive learning to align audio and text representations. Identifying the optimal text description for an audio class is challenging, particularly when the class comprises a wide variety of sounds. This paper examines few-shot methods designed to improve classification accuracy beyond the zero-shot approach. Specifically, audio embeddings are grouped by class and processed to replace the inherently noisy text embeddings. Our results demonstrate that few-shot classification typically outperforms the zero-shot baseline.

classification, large language model, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2507.20036

Country: Europe > United Kingdom (0.28)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.78)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Add feedback